… and embeddings
The SCIP import path pushed pre-computed symbols/occurrences/edges via Delta, but the daemon silently dropped them in multiple places:
- Journal wrote empty text → symbols lost on daemon restart
- upsert_file_precomputed ignored CPG edges → blast-radius broken
- stale_files hashed empty text → infinite Merkle re-sync loop
- file_source_text returned "" → embeddings, stream_context, and explain-match all failed for imported files
Fix: FileInput now carries a precomputed flag and content_hash. JournalEntry::UpsertFilePrecomputed persists symbols, occurrences, and edges so they survive compact + replay. stale_files uses the stored content_hash. file_source_text falls back to disk for precomputed file:// URIs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- session.rs: explain why Tier 2 verification is skipped for pre-computed SCIP imports (source_opt is None by design, SCIP emitters are authoritative)
- export.rs: document that SCIP round-trips lose CPG edges since the SCIP wire format has no edge representation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on, CI workflow
Five improvements in one batch:
1. SCIP integration test: end-to-end test proving pre-computed symbols from Delta are searchable via WorkspaceSymbols and resolvable via QueryDefinition (regression coverage for the import path fix)
2. Proto fix: Relationship.is_override → is_definition to match upstream SCIP field 5 semantics; export mapping updated accordingly
3. SCIP CI action: reusable GitHub Actions workflow (.github/workflows/scip-import.yml) that runs a SCIP indexer (rust/typescript/python), starts a LIP daemon, and pushes the index at confidence 100
4. Tier 2 test harness: 14 unit tests for the verification manager (routing dispatch, channel backpressure, confidence elevation, symbol upgrade merging, backend unavailability)
5. Name-dep invalidation: new invalidated_files_for() query answering "which files break if these symbols change" using the existing file_consumed_names index; wired into the daemon protocol as QueryInvalidatedFiles / InvalidatedFilesResult
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
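The name-dep invalidation query in item 5 could look roughly like the sketch below. It is a minimal illustration only: the `ConsumedNames` alias and the plain `HashMap` layout are assumptions standing in for the daemon's real `file_consumed_names` index, whose actual shape is not shown in this PR.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Hypothetical stand-in for the `file_consumed_names` index:
/// file URI -> set of symbol names that file references.
type ConsumedNames = HashMap<String, HashSet<String>>;

/// Returns every file that consumes at least one of the changed names.
/// A BTreeSet keeps the result order deterministic.
fn invalidated_files_for(index: &ConsumedNames, changed: &[&str]) -> BTreeSet<String> {
    index
        .iter()
        .filter(|(_, names)| changed.iter().any(|c| names.contains(*c)))
        .map(|(file, _)| file.clone())
        .collect()
}
```

Scanning the whole index is linear in the number of files; a reverse name → files map would avoid that, at the cost of a second index to maintain.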
New protocol message that computes blast radius for all symbols defined
in the given changed files in one call. When min_score is present,
each file's embedding is compared against the index and neighbours
above the threshold are returned as semantic_items with a source tier
(static / semantic / both).
Designed for CKB's BlastRadiusEnricher: one round-trip prefetch in
reviewPR, static callers stay authoritative for thresholds, semantic
callers are advisory with per-item confidence.
Wire format:
→ query_blast_radius_batch { changed_file_uris, min_score? }
← blast_radius_batch_result { results: [EnrichedBlastRadius] }
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
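The static / semantic / both tagging described above can be sketched as a merge of two inputs. This is an assumption-laden illustration: `Source`, `tag_sources`, and the inclusive `>=` threshold comparison are hypothetical names and choices, not the daemon's actual types.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Source {
    Static,
    Semantic,
    Both,
}

/// Merge static callers with semantic neighbours that reach `min_score`.
/// When `min_score` is None, no semantic enrichment is applied.
fn tag_sources(
    static_callers: &[&str],
    neighbours: &[(&str, f32)],
    min_score: Option<f32>,
) -> BTreeMap<String, Source> {
    let mut out = BTreeMap::new();
    for uri in static_callers {
        out.insert(uri.to_string(), Source::Static);
    }
    if let Some(threshold) = min_score {
        for (uri, score) in neighbours {
            if *score < threshold {
                continue; // below the threshold: not reported
            }
            out.entry(uri.to_string())
                // A static caller that is also a neighbour becomes Both.
                .and_modify(|s| {
                    if *s == Source::Static {
                        *s = Source::Both;
                    }
                })
                .or_insert(Source::Semantic);
        }
    }
    out
}
```

Static entries are never removed by the semantic pass, which matches the stated intent that static callers stay authoritative and semantic ones are advisory.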
…changelog
- LIP_SPEC.mdx §8.1.1: batch blast radius with semantic enrichment, symbol kind filtering rationale, embedding scope note (file-level today, per-function when chunked embeddings land)
- daemon.mdx: add QueryBlastRadiusBatch to protocol message table
- CHANGELOG.md: document all unreleased changes (SCIP fixes, journal persistence, name-dep invalidation, blast radius batch)
- db.rs: filter blast_radius_batch to Function/Method/Class/Interface/Constructor/Macro kinds; add embedding scope comment
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…yAbiHash, Tier1.5, backoff
Tier 1:
- NearestItem.embedding_model: per-hit model provenance on all nearest-neighbour results
- blast_radius_batch: symbol-level semantic enrichment when symbol_embeddings available;
SemanticImpactItem.symbol_uri now non-empty at function granularity, falls back to file-level
- ReindexStale { uris, max_age_seconds } → ReindexStaleResult { reindexed, skipped }:
atomic check-then-reindex replacing the QueryFileStatus → ReindexFiles race
Tier 2:
- BatchFileStatus { uris } → BatchFileStatusResult { entries: Vec<FileStatusEntry> }:
multi-file status in one round-trip, batchable
- Tier 2 backoff recovery: all 8 LSP backends recover from crashes with exponential backoff
(2–300s); permanently disabled only after 8 consecutive failures (BackoffState struct)
Tier 3:
- QueryAbiHash { uri } → AbiHashResult { uri, hash }: SHA-256 over exported symbol surface,
stable recompilation trigger (batchable); Kotlin IC-style ABI fingerprinting
- LipDatabase::run_tier1_5_inference(): Datalog fixed-point loop — callee elevation when all
callers ≥ 80 confidence, exported-leaf +5 bump; ceiling 65 (Tier 1.5 level)
All new variants wired into variant_tag, supported_messages, is_batchable, and the
BatchQuery sync handler. 313 unit tests + 14 integration tests green, clippy clean, fmt clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
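The Tier 2 backoff recovery described above (2–300s, disabled after 8 consecutive failures) might be modelled as follows. Only the struct name `BackoffState` and the numeric bounds come from the commit message; the doubling factor, field name, and method names are assumptions made for illustration.

```rust
use std::time::Duration;

const MIN_DELAY_SECS: u64 = 2;
const MAX_DELAY_SECS: u64 = 300;
const MAX_CONSECUTIVE_FAILURES: u32 = 8;

#[derive(Debug, Default)]
struct BackoffState {
    consecutive_failures: u32,
}

impl BackoffState {
    /// Record a backend crash. Returns the delay before the next restart,
    /// or None once the backend should be permanently disabled.
    fn on_failure(&mut self) -> Option<Duration> {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= MAX_CONSECUTIVE_FAILURES {
            return None;
        }
        // Assumed growth: the delay doubles per failure, clamped to the cap.
        let exp = self.consecutive_failures - 1;
        let secs = MIN_DELAY_SECS
            .saturating_mul(1u64 << exp)
            .min(MAX_DELAY_SECS);
        Some(Duration::from_secs(secs))
    }

    /// A healthy response resets the failure streak.
    fn on_success(&mut self) {
        self.consecutive_failures = 0;
    }
}
```

With plain doubling the 300s cap is never reached before the eighth failure, so the real implementation presumably uses a different growth factor or jitter; the cap is kept here so the sketch stays within the stated range either way.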
EnrichedBlastRadius gains file_uri so callers can trace results back to their input file. BlastRadiusBatchResult gains not_indexed_uris (skip_serializing_if empty, back-compat) to distinguish "URI not in index" from "URI has zero callers"; previously both were silently empty. blast_radius_batch now checks file_inputs before calling file_symbols.
file_symbols guards against the cold-cache path for precomputed files: if sym_cache is cold and the file is marked precomputed, return [] instead of falling through to Tier 1 parsing on empty text.
Three new tests: blast_radius_batch_not_indexed_uris_reported, blast_radius_batch_file_uri_populated, file_symbols_precomputed_cold_cache_returns_empty.
Docs: v2.1 + v2.2 roadmap sections added to spec.mdx and LIP_SPEC.mdx; ReindexStale/BatchFileStatus/QueryAbiHash added to daemon message table.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ships 5 additive features so CKB can retire its duplicate SCIP parser. Protocol version stays at 2; every new field uses serde defaults + skip_serializing_if. Drift-guard test covers supported_messages and variant_tag for the two new client messages.
#1 Rich symbol metadata: signature_normalized, modifiers, visibility + visibility_confidence, container_name, extraction_tier, modifiers_source on OwnedSymbolInfo. Tier-1 populates the structural fields; SCIP importer derives modifiers via prefix-parse and uses upstream-compatible enclosing_symbol=8.
#2 Reference classification: ReferenceKind (Unknown/Call/Read/Write/Type/Implements/Extends) + is_test on OwnedOccurrence. Tier-1 classifier uses tree-sitter parent/field lookup; SCIP import/export maps to SymbolRole::Read/WriteAccess and Test bits.
#3 QueryBlastRadiusSymbol: single-symbol wrapper around blast_radius_for_symbol with semantic enrichment; returns None for unknown or unindexed symbols.
#4 QueryOutgoingCalls: forward call-graph BFS. New caller_to_callees index mirrors the reverse map, populated in upsert paths and cleaned in remove_file_call_edges. Depth clamped [1,8]; NODE_LIMIT=200 with truncated flag.
#5 Ranked workspace symbols: kind_filter, scope, modifier_filter on QueryWorkspaceSymbols; WorkspaceSymbolsResult gains ranked: Vec<RankedSymbol> with tiered scoring (Exact=1.0 / Prefix=0.8 / Fuzzy=0.5). Empty query preserves pre-v2.3 behavior (ranked=[]).
21 integration tests green; new coverage for every feature.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
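The tiered scoring in #5 can be illustrated with a small ranking function. The three score values and the empty-query behavior come from the commit message; the `RankedSymbol` fields, case-insensitive matching, and the subsequence definition of "fuzzy" are assumptions for this sketch.

```rust
#[derive(Debug, Clone, PartialEq)]
struct RankedSymbol {
    name: String,
    score: f32,
}

/// Assumed fuzzy rule: every query char appears in order within the name.
fn is_subsequence(query: &str, name: &str) -> bool {
    let mut chars = name.chars();
    query.chars().all(|q| chars.any(|c| c == q))
}

/// Exact = 1.0, Prefix = 0.8, Fuzzy = 0.5, otherwise no match.
fn score(query: &str, name: &str) -> Option<f32> {
    let (q, n) = (query.to_lowercase(), name.to_lowercase());
    if n == q {
        Some(1.0)
    } else if n.starts_with(&q) {
        Some(0.8)
    } else if is_subsequence(&q, &n) {
        Some(0.5)
    } else {
        None
    }
}

/// Rank candidates by score, best first; ties break alphabetically.
fn rank(query: &str, names: &[&str]) -> Vec<RankedSymbol> {
    if query.is_empty() {
        return Vec::new(); // empty query keeps the pre-v2.3 unranked behavior
    }
    let mut ranked: Vec<RankedSymbol> = names
        .iter()
        .filter_map(|n| {
            score(query, n).map(|s| RankedSymbol { name: n.to_string(), score: s })
        })
        .collect();
    ranked.sort_by(|a, b| b.score.total_cmp(&a.score).then_with(|| a.name.cmp(&b.name)));
    ranked
}
```

Fixed tier scores keep ordering predictable for callers; a real implementation would presumably apply kind_filter, scope, and modifier_filter before scoring.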
Converge client and daemon URI conventions and back-fill call edges when SCIP imports omit them. Fixes the "lip import prints success but every file shows indexed: false" bug reported against v2.3.0.
- RegisterProjectRoot message + daemon canonicalizes lip://local/<rel> against registered roots (longest-first); capability advertised in HandshakeResult.supported_messages
- EdgesSource provenance on EnrichedBlastRadius (Tier1 | ScipWithTier1Edges | ScipOnly | Empty) so CKB can route around files LIP has no structural edges for
- upsert_file_precomputed reads the file from disk and runs tier-1 when the incoming SCIP document has empty edges
- lip import emits canonical lip://local//<abs>/<rel> (or lip://local/<rel> when Metadata.project_root is absent), replacing the old file:///<rel> form that silently mismatched CKB queries
- lip import --verify round-trips up to 10 sampled documents after push and exits non-zero on any mismatch
Bumps workspace version 2.2.0 → 2.3.1; v2.3.0 features and v2.3.1 fixes ship in the same release.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every session subscribes to the daemon's push-notification broadcast and
writes pending notifications back to its client after each response. Every
Delta{Upsert} also emitted an IndexChanged onto that same broadcast — so
the session wrote TWO frames per delta (DeltaAck + self-emitted
IndexChanged) while `lip import` read ONE frame per iteration. Frame
production ran one frame ahead of consumption; after ~65 deltas the 8 KB
macOS AF_UNIX send buffer filled, write_message parked mid-frame,
read_message never ran, both processes idle at 0% CPU.
Fix: tag every broadcast message with the emitting session's id
(Notification { source_session: Option<u64>, message: ServerMessage }) and
have the drain loop skip envelopes whose source_session matches its own.
Tier 2 upgrades emit with source_session=None so they still reach every
session. LipDaemon holds an AtomicU64 and assigns a fresh id per accept.
Regression test daemon_bulk_precomputed_import_does_not_deadlock pushes
200 precomputed deltas through a single session and fails fast if any
IndexChanged echo reaches the client. Verified: test fails at delta 1
without the filter, passes in 60ms with it.
Latent since v2.2.0 when IndexChanged-on-every-upsert landed; surfaced
only now because the v2.3.1 URI fix let CKB imports run long enough to
hit the 8 KB buffer wall (836-doc SCIP bundle froze at ~130).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
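The self-echo filter from this fix can be sketched as a pure function over pending notifications. The `Notification` shape follows the commit message; the `ServerMessage` variants are reduced to two illustrative ones (`SymbolUpgraded` is a hypothetical name for a Tier 2 upgrade message) and `drain_for_session` is an assumed helper, not the daemon's real drain loop.

```rust
#[derive(Debug, Clone, PartialEq)]
enum ServerMessage {
    IndexChanged { uri: String },
    // Hypothetical variant standing in for a Tier 2 upgrade notification.
    SymbolUpgraded { uri: String },
}

#[derive(Debug, Clone)]
struct Notification {
    /// Id of the session that caused this broadcast; None for daemon-wide events.
    source_session: Option<u64>,
    message: ServerMessage,
}

/// Messages a session should forward to its client: everything except
/// notifications that the session itself triggered.
fn drain_for_session(session_id: u64, pending: &[Notification]) -> Vec<ServerMessage> {
    pending
        .iter()
        .filter(|n| n.source_session != Some(session_id))
        .map(|n| n.message.clone())
        .collect()
}
```

Because `None != Some(id)` for every id, broadcasts without a source session still reach all sessions, which is the behavior the fix relies on for Tier 2 upgrades.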
… + path-traversal guard)
Five correctness fixes discovered after v2.3.1 shipped and CKB began consuming EnrichedBlastRadius end-to-end. Wire-compatible via #[serde(flatten)] on BlastRadiusResult; protocol_version stays at 2.
Changed:
- edges_source moved from EnrichedBlastRadius onto BlastRadiusResult so non-enriched QueryBlastRadius carries call-edge provenance too; JSON shape unchanged.
Fixed:
- Tier-1 back-fill URIs now translate to SCIP descriptor form, both same-file (via display_name map) and cross-file (via name_to_symbols index with single-match guard). Symbol URIs no longer blank on wire responses for CKB dedup.
- Path-traversal guard in convert_document rejects SCIP documents whose relative_path escapes the project root under string-level normalization; this stops Go build-cache artefacts leaking into the graph.
- Double lip://local/ prefix in callee_to_callers keys: lip_uri now detects an existing lip://local/ prefix when the back-fill replays tree-sitter against a canonical-URI file.
- SCIP-descriptor vs tier-1-identifier mismatch in callee_name_to_callers: new normalize_callee_name(fragment) strips trailing () / . / : / # at all four insert sites plus the BFS lookup, so SCIP and tier-1 callees share keys.
Added:
- LIP_DEBUG_EDGES=1 diagnostic gating for upsert_file_precomputed, Phase-2 BFS, and the wire serializer. Wire log reports has_edges_source / body_bytes / 500-char head, truncation-free.
Tests: +normalize_callee_name_strips_scip_descriptor_suffixes, +edges_source_survives_all_response_envelopes, +tier1_backfill_translates_caller_uri_to_scip_fragment, +tier1_backfill_resolves_cross_file_callee_via_name_index.
409 unit + 26 integration + 44 lip-cli all green.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
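As an illustration of the suffix stripping that normalize_callee_name is described as doing, here is a minimal sketch. The function name and the stripped suffixes come from the commit message; the loop-until-stable strategy and the borrowed return type are assumptions.

```rust
/// Strip trailing SCIP descriptor punctuation so that a SCIP fragment such
/// as `AnalyzeImpact().` and a tier-1 identifier `AnalyzeImpact` share a key.
fn normalize_callee_name(fragment: &str) -> &str {
    let mut name = fragment;
    loop {
        // Remove one trailing "()" if present, then any run of '.', ':' or '#'.
        let trimmed = name
            .strip_suffix("()")
            .unwrap_or(name)
            .trim_end_matches(|c| matches!(c, '.' | ':' | '#'));
        if trimmed == name {
            return name; // nothing left to strip
        }
        name = trimmed;
    }
}
```

Looping until the string stops changing handles mixed suffixes like `().` regardless of order, at the cost of also stripping repeated suffixes, which may or may not match the real function.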
…iagnostic
Bug D (CKB testdrive follow-up): when the tier-1 back-fill resolver's `translate` and `name_to_symbols` indexes both miss for a caller name, the back-fill preserves the raw tier-1 URI (`lip://local//<abs>#<name>`) as the caller in `callee_to_callers`. `def_index` was never populated for that URI (only SCIP occurrences register there), so Phase 3 of `blast_radius_for` skipped every such caller and Phase 4 emitted 100% blank `symbol_uri` in the CKB testdrive.
Phase 3 now falls back to deriving the file URI by stripping the `#<name>` fragment when `def_index` misses and the caller URI carries the `lip://local/` scheme, using the caller URI verbatim as `symbol_uri`. No double-indexing required.
Regression test imports a caller file with no SCIP symbols against an on-disk source so the resolver must miss, then asserts the ImpactItem carries the full tier-1 caller URI rather than a blank.
Also split the LIP_DEBUG_EDGES `upsert_precomputed` log into `scip_pairs` / `tier1_pairs`. The previous `pairs=N` total was ambiguous between "N from SCIP" (→ ScipOnly) and "0 from SCIP, N from back-fill" (→ ScipWithTier1Edges), masking upstream SCIP producer drift as LIP regression.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
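The Phase 3 fallback described above might look like the following sketch. Here `def_index` is simplified to a symbol URI → file URI map, and both helper names are hypothetical; only the `lip://local/` prefix check and the `#<name>` strip are taken from the commit message.

```rust
use std::collections::HashMap;

const LIP_LOCAL: &str = "lip://local/";

/// Derive the file URI from a raw tier-1 caller URI of the form
/// `lip://local/<path>#<name>` by dropping the `#<name>` fragment.
fn strip_fragment(caller_uri: &str) -> Option<&str> {
    if !caller_uri.starts_with(LIP_LOCAL) {
        return None;
    }
    let (file_uri, name) = caller_uri.rsplit_once('#')?;
    (!name.is_empty() && file_uri.len() > LIP_LOCAL.len()).then_some(file_uri)
}

/// Look the caller up in the (simplified) def_index first; on a miss, fall
/// back to the fragment strip. The caller URI itself stays the symbol_uri.
fn resolve_caller_file<'a>(
    def_index: &'a HashMap<String, String>,
    caller_uri: &'a str,
) -> Option<&'a str> {
    def_index
        .get(caller_uri)
        .map(String::as_str)
        .or_else(|| strip_fragment(caller_uri))
}
```

Restricting the fallback to the `lip://local/` scheme keeps URIs from other schemes on the old skip path instead of guessing a file for them.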
…astRadiusSymbol)
Additive RPC so CKB can query the forward call-graph direction with the same enriched envelope and edges_source provenance gating as blast radius. BFS over caller_to_callees, depth clamped 1..=8, NODE_LIMIT=200. Symmetric Bug-D-style #<name>-strip fallback on the callee side. Semantic enrichment via SemanticImpactItem with Static/Semantic/Both tagging (symbol embedding preferred, file embedding fallback). edges_source lives on OutgoingImpactStatic so CKB can apply the same EdgesSourceEmpty → skip fold gate.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
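A bounded BFS of the kind described here (depth clamped to 1..=8, NODE_LIMIT=200 with a truncated flag) can be sketched as follows. The `OutgoingCalls` struct, the function name, and the string-keyed graph are illustrative assumptions; the real traversal also handles name-bridge lookups and enrichment, which are omitted.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

const NODE_LIMIT: usize = 200;

struct OutgoingCalls {
    callees: Vec<String>,
    truncated: bool,
}

/// Breadth-first walk over a caller -> callees map, starting at `seed`.
fn outgoing_calls(
    caller_to_callees: &HashMap<String, Vec<String>>,
    seed: &str,
    depth: u32,
) -> OutgoingCalls {
    let max_depth = depth.clamp(1, 8);
    let mut seen: HashSet<String> = HashSet::from([seed.to_string()]);
    let mut queue = VecDeque::from([(seed.to_string(), 0u32)]);
    let mut callees = Vec::new();

    while let Some((node, d)) = queue.pop_front() {
        if d == max_depth {
            continue; // do not expand nodes at the depth boundary
        }
        for callee in caller_to_callees.get(&node).into_iter().flatten() {
            if !seen.insert(callee.clone()) {
                continue; // already visited (also guards against cycles)
            }
            if callees.len() == NODE_LIMIT {
                // More nodes exist than we are willing to return.
                return OutgoingCalls { callees, truncated: true };
            }
            callees.push(callee.clone());
            queue.push_back((callee.clone(), d + 1));
        }
    }
    OutgoingCalls { callees, truncated: false }
}
```

Setting `truncated` only when an unseen node is actually dropped means a graph with exactly 200 reachable callees is reported as complete.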
Three-tier resolution (slice URI prefix → SCIP package descriptor → language-appropriate manifest walk), resolved once at upsert time and stored on FileInput. Surfaces on every ImpactItem and SemanticImpactItem built by blast_radius_for / blast_radius_for_symbol / blast_radius_batch and outgoing_impact_for so CKB's cross-module risk classifier gets a useful grouping key instead of collapsing to ModuleCount=0.
Manifest coverage: Cargo.toml, go.mod, package.json, pyproject.toml, setup.py, pubspec.yaml. Unsupported languages (C/C++/Kotlin/Swift/Java) return None.
Field is #[serde(default, skip_serializing_if = None)], so the wire shape stays byte-identical for emitters that don't populate it.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
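The third tier, the manifest walk, could be approximated as below. This sketch checks every supported manifest name regardless of language (the real resolver is described as language-appropriate) and returns the containing directory rather than a finished module_id; the function name and the injected `exists` predicate are assumptions that keep the example free of filesystem access.

```rust
use std::path::{Path, PathBuf};

const MANIFESTS: [&str; 6] = [
    "Cargo.toml",
    "go.mod",
    "package.json",
    "pyproject.toml",
    "setup.py",
    "pubspec.yaml",
];

/// Walk from the file's directory towards the root and return the first
/// directory that contains a known manifest, if any.
fn nearest_manifest_dir(file: &Path, exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    file.ancestors()
        .skip(1) // skip the file itself, start at its parent directory
        .find(|dir| MANIFESTS.iter().any(|m| exists(&dir.join(m))))
        .map(Path::to_path_buf)
}
```

In production the predicate would likely be `|p| p.is_file()`; passing it in makes the walk testable and lets the result be cached once at upsert time, as the commit describes.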
Forward-direction twin of v2.3.2's callee_name_to_callers. Index keyed by normalize_callee_name(extract_name(from_uri)); populated at all three edge-insertion sites (regular tier-1 upsert, SCIP pre-computed edges, SCIP-empty tier-1 back-fill) and pruned in remove_file_call_edges.
outgoing_impact_for's BFS now consults both caller_to_callees (URI-exact) and caller_name_to_callees (name-bridge) on every hop, matching Phase 2 of blast_radius_for. Closes the asymmetry where QueryOutgoingImpact seeded from a SCIP descriptor URI (e.g. pkg#Engine#AnalyzeImpact().) returned empty direct_items because the tier-1 back-fill had kept the raw tier-1 caller URI when the method name was ambiguous across the codebase (translate-map miss + name_to_symbols multi-hit fallthrough).
Regression test outgoing_impact_name_bridge_for_tier1_caller_uri. All 438 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pure CI hygiene, no behavioural change. Rust 1.95 added several clippy lints that trip on pre-existing idioms in the v2.3 codebase:
- unnecessary_map_or: is_some_and over map_or(false, …)
- unnecessary_sort_by: sort_by_key + std::cmp::Reverse
- manual_pattern_char_cmp: .find(['@', '/']) over closure
- cloned_ref_to_slice_refs: std::slice::from_ref for single-element slices
Plus cargo fmt across 19 files to align with current rustfmt output.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
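For readers unfamiliar with the four lints, the rewrites look roughly like this. The functions are invented for illustration and do not come from the LIP codebase; each comment shows the older idiom the lint flags.

```rust
use std::cmp::Reverse;

fn has_long_name(name: Option<&str>) -> bool {
    // unnecessary_map_or: was `name.map_or(false, |n| n.len() > 8)`
    name.is_some_and(|n| n.len() > 8)
}

fn sort_descending(scores: &mut [u32]) {
    // unnecessary_sort_by: was `scores.sort_by(|a, b| b.cmp(a))`
    scores.sort_by_key(|&s| Reverse(s));
}

fn first_separator(s: &str) -> Option<usize> {
    // manual_pattern_char_cmp: was `s.find(|c| c == '@' || c == '/')`
    s.find(['@', '/'])
}

fn as_single_slice<T>(item: &T) -> &[T] {
    // cloned_ref_to_slice_refs: was `&[item.clone()]`, which copies the value
    std::slice::from_ref(item)
}
```

All four replacements are behavior-preserving, which is consistent with the commit's claim of no behavioural change.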
Superseded by #24
Summary
v2.3 release bundle: 9 commits on top of main, spanning the CKB structural-parity work (v2.3.0) through the forward-direction name-bridge fix (v2.3.5).
- v2.3.0: CKB structural-parity bundle. Rich symbol metadata (signature_normalized, modifiers, visibility, container_name, extraction_tier, modifiers_source); reference classification (ReferenceKind + is_test on every occurrence); QueryBlastRadiusSymbol, QueryOutgoingCalls; ranked + filtered QueryWorkspaceSymbols.
- v2.3.1: CKB import landing. RegisterProjectRoot + daemon-side canonical URI resolution; EdgesSource provenance on blast radius; tier-1 edge back-fill when SCIP imports carry none; lip import --verify; self-echo deadlock fix on bulk precomputed imports.
- v2.3.2: CKB testdrive follow-up. edges_source moved onto BlastRadiusResult; tier-1 back-fill URIs translate to SCIP descriptor form (same-file + cross-file); path-traversal guard on SCIP ingestion; callee_name_to_callers normalisation; Phase-3 blank-symbol_uri fallback.
- v2.3.3: QueryOutgoingImpact. Forward-direction twin of QueryBlastRadiusSymbol with the same EnrichedOutgoingImpact envelope, edges_source, and semantic enrichment via SemanticImpactItem { source: Static | Semantic | Both }.
- v2.3.4: module_id on impact items. ImpactItem.module_id + SemanticImpactItem.module_id resolved once at upsert time (slice URI / SCIP package / manifest walk). Unlocks CKB's RecomputeBlastRadius.ModuleCount for non-sliced LIP-only traffic.
- v2.3.5: Forward-direction name-bridge symmetry. New caller_name_to_callees index mirrors v2.3.2's callee_name_to_callers at every edge-insertion site; outgoing_impact_for BFS now consults both URI-exact and name-bridge indexes on every hop. Closes the asymmetry where QueryOutgoingImpact seeded from a SCIP descriptor URI returned empty direct_items for name-overloaded methods.
protocol_version stays at 2; every new field is #[serde(default, skip_serializing_if = …)]; every new message is advertised in HandshakeResult.supported_messages.
Test plan
- cargo test --lib: 438/438 passing
- cargo check: clean, no warnings
- blast_radius_phase3_fallback_for_tier1_caller_uri, outgoing_impact_phase3_fallback_for_tier1_callee_uri, outgoing_impact_name_bridge_for_tier1_caller_uri, normalize_callee_name_strips_scip_descriptor_suffixes, daemon_bulk_precomputed_import_does_not_deadlock, blast_radius_surfaces_module_id_from_scip_descriptor, blast_radius_surfaces_module_id_from_cargo_toml_walk, outgoing_impact_surfaces_module_id, edges_source_survives_all_response_envelopes, tier1_backfill_translates_caller_uri_to_scip_fragment, tier1_backfill_resolves_cross_file_callee_via_name_index
- QueryOutgoingImpact returns non-empty direct_items (the v2.3.5 scenario)
- module_id surfacing in RecomputeBlastRadius.ModuleCount
🤖 Generated with Claude Code